Recently, the killing of a Chinese grad student at the University of Chicago in a shooting near the campus raised great concern about gun violence again. As public health students living in NYC, we believe that deeper insight into gun violence in the city, such as overall patterns in location and time and potential associations with other public health issues, would not only help with self-protection but might also be instrumental in advocating for legislative changes that make NYC communities safer. For this project, our objective is to create:
* a report on the spatial and temporal distribution of shooting incidents in NYC and their potential relation to other factors, including the potential connection between shooting cases and COVID-19.
* a website presenting all of the work and results.
Although open carry is not directly banned, New York City prohibits the possession of a “loaded” handgun outside of the home or place of business without a carry license. NYC is usually thought to be stricter on guns than some other states and cities, but for geographic and political reasons the gun situation in NYC is much more complicated.
New York City has been the site of many Black Lives Matter protests in response to incidents of police brutality and racially motivated violence against Black people, especially during the George Floyd protests (May–June 2020).
Political issues also have a great influence on gun violence, for example a change of mayor or changes in crime policy.
In addition, COVID-19 changed the lifestyle of New Yorkers, which may in turn have affected the frequency and distribution of shooting cases in NYC.
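We mainly used three datasets:
* NYPD Shooting Incident Data (Historic)
* NYPD Shooting Incident Data (Year To Date)
* COVID-19 Daily Counts of Cases, Hospitalizations, and Deaths

All datasets were downloaded from NYC Open Data.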
Both the historic and year-to-date NYPD shooting incident data contain the incident number, occurrence date and time, demographic characteristics of the perpetrator and victim, geographic location, and other details of the shootings that happened in NYC. The historic data contain cases from 2006 to 2020, and the year-to-date data contain cases from January 2021 to September 2021. The data are extracted each quarter by the Office of Management Analysis and Planning. They can help government and police understand the patterns behind shooting incidents. Each row represents a unique victim, so the same incident key may occur several times when an incident has multiple victims. Many values are missing for perp_age_group, perp_sex and perp_race because information about the perpetrator was unknown or not available at the time of collection. Key variables used in our analysis are:
* incident_key: randomly generated persistent ID for each incident.
* occur_date: exact date of the shooting incident.
* occur_time: exact time of the shooting incident.
* statistical_murder_flag: whether the shooting resulted in the victim’s death, which would be counted as a murder.
* borough: borough where the shooting incident occurred.
* perp_age_group: perpetrator’s age within a category.
* perp_sex: perpetrator’s sex description.
* perp_race: perpetrator’s race description.
* vic_age_group: victim’s age within a category.
* vic_sex: victim’s sex description.
* vic_race: victim’s race description.
* latitude: latitude coordinate for Global Coordinate System, WGS 1984, decimal degrees (EPSG 4326).
* longitude: longitude coordinate for Global Coordinate System, WGS 1984, decimal degrees (EPSG 4326).

COVID-19 Daily Counts of Cases, Hospitalizations, and Deaths: contains COVID-related hospitalizations and confirmed and probable deaths for NYC as a whole and for each borough. It is managed by the NYC Department of Health and Mental Hygiene Incident Command System for COVID-19 Response and is updated every day. We cleaned the data by renaming some variables and changing their types. In addition, we added a variable borough to show the borough information more explicitly. Key variables are:
* month, day, year: specific day, month, and year of COVID-19 diagnosis.
* borough: borough where the case is confirmed.
* borough_case_count: total case count of COVID-19 in the borough on a given day.
* total_case_count: total case count of COVID-19 in NYC on a given day.

Set up and import NYPD shooting data and COVID-19 data.
library(tidyverse)
library(rvest)
library(knitr)
library(leaflet)
library(rgdal)
library(lubridate)
library(plotly)
library(modelr)
col1 = "#d8e1cf"
col2 = "#438484"
theme_set(theme_minimal() + theme(legend.position = "bottom"))
options(
ggplot2.continuous.colour = "viridis",
ggplot2.continuous.fill = "viridis"
)
scale_colour_discrete = scale_colour_viridis_d
scale_fill_discrete = scale_fill_viridis_d
shooting_initial =
read_csv("./data/NYPD_Shooting.csv") %>%
janitor::clean_names()
shooting_2021 = read_csv("./data/NYPD_shooting_New.csv") %>%
janitor::clean_names()
covid_counts = read.csv("./data/COVID19_data.csv", sep = ";") %>%
as_tibble()
Cleaning NYPD shooting data:
We format the data to use appropriate variable names and use separate() to break the variable occur_date into month, day and year. Based on the count of missing values in each column shown below, we fill in missing values of location_desc with the label "NONE". We also convert variables to character or factor types where appropriate.
#A variable name in shooting_2021 differs from the historic data; rename the column so the data frames can be merged
shooting_2021 = shooting_2021 %>%
rename(lon_lat = new_georeferenced_column)
shooting = rbind(shooting_initial, shooting_2021)
#check null values in each column
shooting %>%
summarise_all(~ sum(is.na(.))) %>% knitr::kable()
| incident_key | occur_date | occur_time | boro | precinct | jurisdiction_code | location_desc | statistical_murder_flag | perp_age_group | perp_sex | perp_race | vic_age_group | vic_sex | vic_race | x_coord_cd | y_coord_cd | latitude | longitude | lon_lat |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 3 | 14626 | 0 | 9106 | 9072 | 9072 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
shooting = shooting %>%
mutate(boro = as.factor(boro)) %>%
mutate(location_desc = replace_na(location_desc, "NONE")) %>%
mutate(location_desc = as.factor(location_desc)) %>%
separate(occur_date, into = c("month", "day", "year")) %>%
mutate(month = as.numeric(month)) %>%
arrange(year, month) %>%
mutate(year = as.character(year)) %>%
mutate(boro = tolower(boro)) %>%
mutate(boro = if_else(boro == "staten island", "staten_island", boro)) %>%
rename(borough = boro) %>%
mutate(date = str_c(month, day, year, sep = "/")) %>%
select(incident_key, date, everything())
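Because each row is a victim rather than an incident, counts of rows and counts of incidents can differ. The following is a minimal sketch of how the two could be compared; the toy data frame and its incident keys are made up for illustration and are not taken from the NYPD data.

```r
library(dplyr)

# Toy data (hypothetical incident keys, not real records):
# incident 101 has two victims, incident 103 has three.
toy_shooting <- data.frame(
  incident_key = c(101, 101, 102, 103, 103, 103),
  vic_sex      = c("M", "F", "M", "M", "M", "F")
)

# n() counts rows (victims); n_distinct() counts unique incidents.
victim_incident_summary <- toy_shooting %>%
  summarise(
    n_victims   = n(),
    n_incidents = n_distinct(incident_key)
  )

victim_incident_summary
```

The rest of this analysis counts rows, so the reported frequencies should be read as numbers of victims.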
Cleaning COVID-19 data:
The clean dataset contains only day-by-day COVID-19 case count for each borough and the total case count in NYC of a particular day. Irrelevant variables are excluded by select(). Borough information and case count in each borough are extracted by pivoting the variables in the raw datasets that contained information of both borough and case count into longer format.
clean_covid = covid_counts %>%
janitor::clean_names() %>%
rename(date = date_of_interest) %>%
select(date, contains("case_count")) %>%
select(-contains(c("probable_case_count", "case_count_7day_avg", "all_case_count_7day_avg"))) %>%
separate(date, into = c("month", "day", "year")) %>%
mutate_all(as.character) %>%
mutate_if(is.character, gsub, pattern = ",", replacement = "") %>%
mutate_if(is.character, as.numeric) %>%
pivot_longer(
cols = bx_case_count:si_case_count,
names_to = "borough",
values_to = "borough_case_count"
) %>%
mutate(borough = gsub("_case_count", "", borough)) %>%
mutate(borough = recode(borough, "bx" = "bronx","bk" = "brooklyn","mn" = "manhattan","si" = "staten_island","qn" = "queens")) %>%
relocate(case_count, .after = borough_case_count) %>%
rename(total_case_count = case_count) %>%
mutate(date = str_c(month, day, year, sep = "/")) %>%
select(date, everything())
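To make the pivoting step easier to follow, here is a small sketch on made-up numbers. The column names mirror the pattern used above (bx_case_count, bk_case_count, si_case_count), but the toy data frame and its values are assumptions for illustration only.

```r
library(dplyr)
library(tidyr)

# Toy wide table: one row per day, one column per borough (made-up values).
toy_covid <- data.frame(
  date          = c("3/1/2020", "3/2/2020"),
  case_count    = c(5, 9),
  bx_case_count = c(1, 2),
  bk_case_count = c(3, 4),
  si_case_count = c(1, 3)
)

# Long format: one row per day and borough.
toy_long <- toy_covid %>%
  pivot_longer(
    cols = bx_case_count:si_case_count,
    names_to = "borough",
    values_to = "borough_case_count"
  ) %>%
  mutate(
    borough = gsub("_case_count", "", borough),
    borough = recode(borough, "bx" = "bronx", "bk" = "brooklyn", "si" = "staten_island")
  ) %>%
  rename(total_case_count = case_count)

toy_long
```

Each day now appears once per borough, with the citywide total repeated on every row for that day.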
shooting_heatmap = shooting_initial %>%
mutate(occur_date = as.Date(occur_date,'%m/%d/%Y')) %>%
mutate(occur_date = weekdays(occur_date)) %>%
separate(occur_time, into = c("hour", "minute", "second")) %>%
mutate(hour = as.factor(hour)) %>%
select(incident_key, occur_date, hour) %>%
mutate(occur_date = as.factor(occur_date),
occur_date = fct_relevel(occur_date, "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday" , "Saturday"))
# Count victims in each hour/weekday cell; attach() is not needed because ggplot() reads its columns from dayHour
dayHour = plyr::ddply(shooting_heatmap, c( "hour", "occur_date"), summarise, N = length(incident_key))
heatmap = ggplot(dayHour, aes(hour, occur_date)) +
geom_tile(aes(fill = N),colour = "white", na.rm = TRUE) +
scale_fill_gradient(low = col1, high = col2) +
guides(fill = guide_legend(title = "Total Shooting Cases")) +
theme_bw() +
theme_minimal() +
labs(title = "Time Based Heatmap",
x = "Shooting Cases Per Hour", y = "Day of Week") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())
heatmap

According to this heatmap, the hours around midnight on weekends (Saturday and Sunday) carry the highest risk of shootings, with the total for a single hour reaching 500. In addition, the daytime hours between 7:00 and 19:00 appear to have fewer shooting cases than the rest of the day.
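The same hour-by-weekday count is computed once for the whole city and once for each borough. One way to avoid repeating the pipeline is a small helper function. The sketch below is not the code used for the figures here: the function name count_by_weekday_hour and the toy rows are hypothetical, and it assumes occur_date is a month/day/year string and occur_time starts with a two-digit hour.

```r
library(dplyr)

# Hypothetical helper: count rows per hour and weekday,
# optionally restricted to one borough.
count_by_weekday_hour <- function(df, boro_name = NULL) {
  if (!is.null(boro_name)) {
    df <- filter(df, boro == boro_name)
  }
  df %>%
    mutate(
      weekday = weekdays(as.Date(occur_date, "%m/%d/%Y")),
      hour    = substr(as.character(occur_time), 1, 2)
    ) %>%
    count(hour, weekday, name = "N")
}

# Toy rows (made up) to show the call pattern.
toy <- data.frame(
  boro       = c("BROOKLYN", "BROOKLYN", "BRONX", "BROOKLYN"),
  occur_date = c("08/01/2020", "08/01/2020", "08/02/2020", "08/03/2020"),
  occur_time = c("23:15:00", "23:40:00", "01:05:00", "14:00:00")
)

citywide <- count_by_weekday_hour(toy)
brooklyn <- count_by_weekday_hour(toy, "BROOKLYN")
```

The returned table has the same hour, weekday and N structure that geom_tile() needs, so a single plotting call could be reused for every borough.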
The Brooklyn heatmap is very similar to the citywide heatmap above, and its total number of shooting cases is also very high. This suggests that Brooklyn is one of the main locations of shootings in NYC and that its pattern is fairly representative of the citywide distribution of shooting incidents.
heatmap_bn = shooting_initial %>%
filter(boro == "BROOKLYN") %>%
mutate(occur_date = as.Date(occur_date,'%m/%d/%Y')) %>%
mutate(occur_date = weekdays(occur_date)) %>%
separate(occur_time, into = c("hour", "minute", "second")) %>%
mutate(hour = as.factor(hour)) %>%
select(incident_key, occur_date, hour) %>%
mutate(occur_date = as.factor(occur_date),
occur_date = fct_relevel(occur_date, "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday" , "Saturday"))
# Count victims in each hour/weekday cell (no attach() needed)
dayHour = plyr::ddply(heatmap_bn, c( "hour", "occur_date"), summarise, N = length(incident_key))
heatmap_Bn = ggplot(dayHour, aes(hour, occur_date)) +
geom_tile(aes(fill = N),colour = "white", na.rm = TRUE) +
scale_fill_gradient(low = col1, high = col2) +
guides(fill = guide_legend(title = "Total Shooting Cases")) +
theme_bw() +
theme_minimal() +
labs(title = "Time Based Heatmap in Brooklyn",
x = "Shooting Cases Per Hour", y = "Day of Week") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())
heatmap_Bn

There is a slight difference between Brooklyn and New York City as a whole: the totals are clearly lower. The highest count for a single hour is 150, compared with 500 in the citywide heatmap. Midnight on weekends (Saturday and Sunday) still carries the highest risk of shootings.
heatmap_bx = shooting_initial %>%
filter(boro == "BRONX") %>%
mutate(occur_date = as.Date(occur_date,'%m/%d/%Y')) %>%
mutate(occur_date = weekdays(occur_date)) %>%
separate(occur_time, into = c("hour", "minute", "second")) %>%
mutate(hour = as.factor(hour)) %>%
select(incident_key, occur_date, hour) %>%
mutate(occur_date = as.factor(occur_date),
occur_date = fct_relevel(occur_date, "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday" , "Saturday"))
# Count victims in each hour/weekday cell (no attach() needed)
dayHour = plyr::ddply(heatmap_bx, c( "hour", "occur_date"), summarise, N = length(incident_key))
heatmap_Bx = ggplot(dayHour, aes(hour, occur_date)) +
geom_tile(aes(fill = N),colour = "white", na.rm = TRUE) +
scale_fill_gradient(low = col1, high = col2) +
guides(fill = guide_legend(title = "Total Shooting Cases")) +
theme_bw() +
theme_minimal() +
labs(title = "Time Based Heatmap in Bronx",
x = "Shooting Cases Per Hour", y = "Day of Week") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())
heatmap_Bx

The distribution of shooting incidents is similar, but the number of cases in Queens stays below 100. Also, a morning jog at 6 a.m. on Tuesday or Wednesday looks nearly stress-free (just kidding).
heatmap_q = shooting_initial %>%
filter(boro == "QUEENS") %>%
mutate(occur_date = as.Date(occur_date,'%m/%d/%Y')) %>%
mutate(occur_date = weekdays(occur_date)) %>%
separate(occur_time, into = c("hour", "minute", "second")) %>%
mutate(hour = as.factor(hour)) %>%
select(incident_key, occur_date, hour) %>%
mutate(occur_date = as.factor(occur_date),
occur_date = fct_relevel(occur_date, "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday" , "Saturday"))
# Count victims in each hour/weekday cell (no attach() needed)
dayHour = plyr::ddply(heatmap_q, c( "hour", "occur_date"), summarise, N = length(incident_key))
heatmap_q = ggplot(dayHour, aes(hour, occur_date)) +
geom_tile(aes(fill = N),colour = "white", na.rm = TRUE) +
scale_fill_gradient(low = col1, high = col2) +
guides(fill = guide_legend(title = "Total Shooting Cases")) +
theme_bw() +
theme_minimal() +
labs(title = "Time Based Heatmap in Queens",
x = "Shooting Cases Per Hour", y = "Day of Week") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())
heatmap_q

Weekends still carry a higher risk, and the highest total for a single hour drops to 80.
heatmap_m = shooting_initial %>%
filter(boro == "MANHATTAN") %>%
mutate(occur_date = as.Date(occur_date,'%m/%d/%Y')) %>%
mutate(occur_date = weekdays(occur_date)) %>%
separate(occur_time, into = c("hour", "minute", "second")) %>%
mutate(hour = as.factor(hour)) %>%
select(incident_key, occur_date, hour) %>%
mutate(occur_date = as.factor(occur_date),
occur_date = fct_relevel(occur_date, "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday" , "Saturday"))
# Count victims in each hour/weekday cell (no attach() needed)
dayHour = plyr::ddply(heatmap_m, c( "hour", "occur_date"), summarise, N = length(incident_key))
heatmap_m = ggplot(dayHour, aes(hour, occur_date)) +
geom_tile(aes(fill = N),colour = "white", na.rm = TRUE) +
scale_fill_gradient(low = col1, high = col2) +
guides(fill = guide_legend(title = "Total Shooting Cases")) +
theme_bw() +
theme_minimal() +
labs(title = "Time Based Heatmap in Manhattan",
x = "Shooting Cases Per Hour", y = "Day of Week") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())
heatmap_m

This is clearly the safest borough in terms of shooting cases. The highest total for a single hour is 25. The highest-risk time is 3 to 4 a.m. on Saturday. Outside of weekends, Monday at 19:00 is also a high-risk time.
heatmap_l = shooting_initial %>%
filter(boro == "STATEN ISLAND") %>%
mutate(occur_date = as.Date(occur_date,'%m/%d/%Y')) %>%
mutate(occur_date = weekdays(occur_date)) %>%
separate(occur_time, into = c("hour", "minute", "second")) %>%
mutate(hour = as.factor(hour)) %>%
select(incident_key, occur_date, hour) %>%
mutate(occur_date = as.factor(occur_date),
occur_date = fct_relevel(occur_date, "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday" , "Saturday"))
# Count victims in each hour/weekday cell (no attach() needed)
dayHour = plyr::ddply(heatmap_l, c( "hour", "occur_date"), summarise, N = length(incident_key))
heatmap_l = ggplot(dayHour, aes(hour, occur_date)) +
geom_tile(aes(fill = N),colour = "white", na.rm = TRUE) +
scale_fill_gradient(low = col1, high = col2) +
guides(fill = guide_legend(title = "Total Shooting Cases")) +
theme_bw() +
theme_minimal() +
labs(title = "Time Based Heatmap in Staten Island",
x = "Shooting Cases Per Hour", y = "Day of Week") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())
heatmap_l

We analyze shooting incidents across time at three different levels. First, we compare shooting cases year by year.

### Distribution of Shooting Cases by Year
shooting_year = shooting %>%
group_by(year) %>%
summarise(n_obs = n())
#visualization shooting incidence trend
shooting_year %>%
plot_ly( x = ~year, y = ~n_obs, type = "scatter", mode = "lines+markers") %>%
layout(title = "Shooting Incidence Trend from 2006 to 2021",
xaxis = list(title = "Year"),
yaxis = list(title = "Frequency"))
According to the data, shooting cases gradually decreased from 2055 in 2006 to 967 in 2019. However, there was a sharp surge in 2020, with 1948 cases, which coincided with the COVID-19 pandemic and the responses to large-scale protests over the killing of George Floyd. Since the 2021 data only cover January through September 30th, we cannot yet tell whether 2021 shows a decrease compared to 2020.
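To put these yearly totals in relative terms, the percent change can be worked out directly from the three figures quoted above. The helper name pct_change is introduced here only for illustration.

```r
# Percent change from an old value to a new value.
pct_change <- function(old, new) (new - old) / old * 100

# Yearly totals quoted in the text: 2055 (2006), 967 (2019), 1948 (2020).
decline_2006_2019 <- pct_change(2055, 967)
surge_2019_2020   <- pct_change(967, 1948)

round(c(decline_2006_2019, surge_2019_2020), 1)
# -52.9 101.4
```

In other words, cases fell by roughly half between 2006 and 2019 and then roughly doubled in a single year.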
Next, we look at the total number of shooting cases in each calendar month, pooled over 2006 to 2021 in New York City.
shooting_month = shooting %>%
mutate(month = as.factor(month)) %>%
group_by(month) %>%
summarise(n_obs = n())
shooting_month %>%
plot_ly(x = ~month, y = ~n_obs, color = ~month, type = "bar") %>%
layout( title = "The Distribution of Shooting Incidence by Month",
xaxis = list(title = "Month"),
yaxis = list(title = "Frequency"))
The distribution of shooting cases by month is bell-shaped. Shootings are concentrated in the summer, from May to September. This may be related to the timing of Memorial Day and Labor Day.
shooting_time = shooting %>%
##format the occur_time variable to only hours.
mutate(occur_time_hour = format(as.POSIXct(occur_time), format = "%H")) %>%
mutate(occur_time_hour = as.numeric(occur_time_hour)) %>%
group_by(occur_time_hour) %>%
summarise(case_numb = n())
#divide day time to 4 groups: 0-6;6-12;12-18;18-24
shooting_time = shooting_time %>%
mutate(occur_time_range = case_when(
occur_time_hour >= 0 & occur_time_hour < 6 ~ "0-6",
occur_time_hour >= 6 & occur_time_hour < 12 ~ "6-12",
occur_time_hour >= 12 & occur_time_hour < 18 ~ "12-18",
occur_time_hour >= 18 & occur_time_hour < 24 ~ "18-24"))
shooting_time = shooting_time %>%
mutate(occur_time_range = factor(occur_time_range, levels = c("0-6","6-12","12-18","18-24")))
ggplot(shooting_time, aes(x = occur_time_range, y = case_numb, fill = occur_time_range)) + geom_col(alpha = 1) + labs(x = "Occur Time Range",
y = "Frequency",
title = "Distribution of Shooting Case by Day")

#pie chart of ratio of shooting cases in a day
shooting_time2 = shooting %>%
mutate(occur_time_hour = format(as.POSIXct(occur_time), format = "%H")) %>%
mutate(occur_time_hour = as.numeric(occur_time_hour)) %>%
group_by(occur_time_hour) %>%
add_count(incident_key, name = "n_shooting") %>%
distinct() %>%
mutate(occur_time_range = case_when(
occur_time_hour >= 0 & occur_time_hour < 6 ~ "0-6",
occur_time_hour >= 6 & occur_time_hour < 12 ~ "6-12",
occur_time_hour >= 12 & occur_time_hour < 18 ~ "12-18",
occur_time_hour >= 18 & occur_time_hour < 24 ~ "18-24"))
shooting_time2 = shooting_time2 %>%
mutate(occur_time_range = factor(occur_time_range, levels = c("0-6","6-12","12-18","18-24"))) %>% add_count(occur_time_range, wt = n_shooting, name = "total_n_shooting")
shooting_time2 = shooting_time2 %>%
group_by(occur_time_range) %>% # Variable to be transformed
count() %>%
ungroup() %>%
# compute the grand total from the data instead of hard-coding it
mutate(total_n = sum(n)) %>%
mutate(perc = n / total_n) %>%
arrange(perc) %>%
mutate(labels = scales::percent(perc))
ggplot(shooting_time2, aes(x = "", y = n, fill = occur_time_range)) +
geom_bar(width = 1, stat = "identity") +
geom_label(aes(label = labels),
position = position_stack(vjust = 0.5),
show.legend = FALSE) +
coord_polar("y", start = 0) +
scale_fill_brewer(palette = "Pastel1") +
theme_void() +
guides(fill = guide_legend(title = "occur Time range")) +
labs(title = "Pie Chart for Distribution of Shooting Case by Day") +
theme(legend.position = "right")

Most shooting cases happen in the evening and late at night, concentrated in the 18-24 and 0-6 ranges. The pie chart shows the share of each time range in a day's shootings.
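The four time ranges used above are left-closed, six-hour bins. As an alternative to repeating the case_when() block, the same binning can be sketched with base R's cut(); the hours vector below is made up to show the boundary behaviour.

```r
# Example hours chosen to sit on both edges of every bin.
hours <- c(0, 5, 6, 11, 12, 17, 18, 23)

# right = FALSE makes each interval [lower, upper), matching
# "hour >= lower & hour < upper" in the case_when() version.
time_range <- cut(
  hours,
  breaks = c(0, 6, 12, 18, 24),
  right  = FALSE,
  labels = c("0-6", "6-12", "12-18", "18-24")
)

table(time_range)
```

cut() returns a factor whose levels are already in chronological order, so no separate factor() call is needed before plotting.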
shooting_time_hour = shooting %>%
##format the occur_time variable to only hours.
mutate(occur_time_hour = format(as.POSIXct(occur_time), format = "%H")) %>%
mutate(occur_time_hour = as.numeric(occur_time_hour)) %>%
group_by(occur_time_hour,vic_sex,borough,vic_race,vic_age_group,perp_sex,perp_race) %>%
add_count(borough, name = "n_shooting") %>%
summarise(case_numb = n())
#divide day time to 4 groups: 0-6;6-12;12-18;18-24
shooting_time_hour = shooting_time_hour %>%
mutate(occur_time_range = case_when(
occur_time_hour >= 0 & occur_time_hour < 6 ~ "0-6",
occur_time_hour >= 6 & occur_time_hour < 12 ~ "6-12",
occur_time_hour >= 12 & occur_time_hour < 18 ~ "12-18",
occur_time_hour >= 18 & occur_time_hour < 24 ~ "18-24"))
shooting_time_hour = shooting_time_hour %>%
mutate(occur_time_range = factor(occur_time_range, levels = c("0-6","6-12","12-18","18-24")))
ggplot(shooting_time_hour, aes(x = occur_time_range, y = case_numb, fill = vic_sex)) + geom_bar(stat = "identity", position = position_dodge()) + labs(x = "Occur Time Range",
y = "Frequency",
title = "Distribution of Shooting Case Across Time-Range Between Vic_Sex")

We observed that in every time range, male victims far outnumber female victims, which reflects the fact that more males are involved in shooting cases. The number of female victims does not differ noticeably across time ranges, which suggests that females are at risk at all times of day. For males, there is a distinct rise from 6-12 to 18-24.
We would like to focus on gun violence in 2020, the critical year of the surge. Since the COVID-19 outbreak started in March 2020, we examine whether there is any relation between shootings and COVID-19.
shooting_2020 = shooting %>%
filter(year == "2020") %>%
mutate(month = as.factor(month)) %>%
group_by(month) %>%
summarise(n_obs = n())
ggplot(shooting_2020, aes(x = month, y = n_obs, fill = month)) + geom_col(alpha = 1) + labs(
x = "Month",
y = "Frequency",
title = "Distribution of Shooting Case across month in 2020 in NYC")
The major rise in gun violence in the city began in 2020, after a period in which violent crime had dropped to its lowest levels in more than six decades. In the first months of the year, gun violence was relatively low under the impact of shelter-in-place orders and social distancing requirements. Starting in the summer, however, shooting cases spiked from June to August, which may also have been fueled by the death of George Floyd.
Since 2020 is the critical year, we compare the monthly shooting cases in 2019 and 2020.
shooting_2019_2020 = shooting %>%
filter(year == "2019" | year == "2020") %>%
mutate(month = as.factor(month)) %>%
group_by(year, month) %>%
summarise(frequency = n())
ggplot(shooting_2019_2020, aes(x = month, y = frequency, fill = year)) + geom_bar(stat = "identity", position = position_dodge(), alpha = 0.75) + labs(
x = "Month",
y = "Frequency",
title = "Shooting Case Across Month in NYC in 2019 & 2020")
Apart from the year-on-year growth from 2019 to 2020, the shape of the monthly distribution is the same in both years, which reinforces that shooting cases are highest in summer. On the other hand, the plot clearly indicates that gun violence surged after March 2020, when the WHO declared COVID-19 a pandemic. One possible explanation is that the pandemic aggravated the very factors that drive gun violence.
After almost a decade and a half of decline in shooting cases in New York City, there was sharp growth in 2020. The collision of the COVID-19 pandemic and gun violence offers possibility amid great loss. There was hope that shelter-in-place orders would not only reduce the spread of COVID-19 but also help reduce gun violence in the city. However, under prolonged emotional and financial stress, shootings in 2020 roughly doubled compared with 2019 (1948 versus 967 cases).
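The cleaned shooting and COVID-19 tables both carry date and borough columns, so one way to examine the relation more directly would be to join daily shooting counts to daily case counts. The sketch below is only an illustration on made-up rows, not part of the analysis above. It also assumes the date strings may not be zero-padded the same way in both tables, which is why both are parsed with lubridate::mdy() before joining.

```r
library(dplyr)
library(lubridate)

# Toy shooting rows (made up): one row per victim.
toy_shootings <- data.frame(
  date    = c("3/01/2020", "3/01/2020", "3/02/2020"),
  borough = c("bronx", "bronx", "queens")
)

# Toy COVID-19 rows (made up): note the unpadded day in the date string.
toy_cases <- data.frame(
  date               = c("3/1/2020", "3/2/2020"),
  borough            = c("bronx", "queens"),
  borough_case_count = c(1, 4)
)

# Parse both date columns to Date so differently formatted strings match,
# count shootings per day and borough, then attach the case counts.
daily <- toy_shootings %>%
  mutate(date = mdy(date)) %>%
  count(date, borough, name = "n_shootings") %>%
  left_join(mutate(toy_cases, date = mdy(date)), by = c("date", "borough"))

daily
```

Joining on the raw strings instead would leave borough_case_count as NA wherever the formats differ, so checking for unexpected NA values after the join is a useful sanity test.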
The maps below provide general information about the shooting incidents in New York City, including the geolocation, the total number of incidents in an area, and background information. GIS data for the borough boundaries were obtained from NYC Open Data.
shooting_map = shooting %>%
# funs() is deprecated in dplyr; across() with replace_na() labels missing perpetrator fields as "UNKNOWN"
mutate(across(c(perp_age_group, perp_sex, perp_race), ~ replace_na(.x, "UNKNOWN"))) %>%
mutate(labels = str_c("<b>Incident Key: </b>", incident_key,
"<br>", "<b>Date: </b>", date,
"<br>", "<b>Borough: </b>", borough,
"<br>", "<b>Murdered: </b>", statistical_murder_flag,
"<br>", "<b>Perpetrator's Race: </b>", perp_race,
"<br>", "<b>Victim's Race: </b>", vic_race,
"<br>", "<b>Perpetrator's Age: </b>", perp_age_group,
"<br>", "<b>Victim's Age: </b>", vic_age_group
))
nyc_boro = readOGR("./data/Borough_Boundaries/geo_export_2204bc6b-9c17-46ed-8a67-7245a1e15877.shp", layer = "geo_export_2204bc6b-9c17-46ed-8a67-7245a1e15877", verbose = FALSE)
polygon_color <- colorFactor(
palette = "viridis",
domain = as.factor(nyc_boro@data$boro_name))
shooting_map %>%
leaflet() %>%
addTiles() %>%
addProviderTiles("CartoDB.Positron") %>%
addMarkers(lng = ~longitude, lat = ~latitude, popup = ~labels,
clusterOptions = markerClusterOptions()) %>%
addPolygons(data = nyc_boro,
weight = 0.85,
fillColor = ~polygon_color(nyc_boro@data$boro_name),
# fillOpacity = 0.6,
color = "#BDBDC3",
label = ~nyc_boro@data$boro_name)